Author: Rebecca Vandewalle rcv3@illinois.edu
Created: 5-19-21
This notebook demonstrates how to perform Geographically Weighted Regression (GWR) with the MGWR Python package, using sample code included in Oshan et al. 2019. MGWR: A Python Implementation of Multiscale Geographically Weighted Regression for Investigating Process Spatial Heterogeneity and Scale. ISPRS Int. J. Geo-Inf. 2019, 8(6), 269; https://doi.org/10.3390/ijgi8060269.
# install MGWR only if it cannot already be imported
try:
    from mgwr.gwr import GWR
except ImportError:
    print('Installing MGWR')
    ! pip install -U mgwr
These packages are required to run the sample code.
# code in this cell is from Oshan et al. 2019
import numpy as np
import pandas as pd
import libpysal as ps
from mgwr.gwr import GWR, MGWR
from mgwr.sel_bw import Sel_BW
from mgwr.utils import compare_surfaces, truncate_colormap
import geopandas as gp
import matplotlib.pyplot as plt
import matplotlib as mpl
Here data for counties in Georgia is loaded from an example dataset. The base data is mapped using matplotlib.
# import Georgia dataset and show counties
# code in this cell is from Oshan et al. 2019
georgia = gp.read_file(ps.examples.get_path('G_utm.shp'))
fig, ax = plt.subplots(figsize = (10, 10))
georgia.plot(ax=ax, **{'edgecolor': 'black', 'facecolor': 'white'})
georgia.centroid.plot(ax = ax, c = 'black')
plt.show()
In the next cells, the data formats used for the example are illustrated.
# inspect georgia data type
type(georgia)
# inspect base data structure
georgia.head()
# inspect centroids
georgia.centroid.head()
Data is transformed into the expected format for loading into the GWR model.
# process input data for GWR
# code in this cell is from Oshan et al. 2019
g_y = georgia['PctBach'].values.reshape((-1, 1))
g_X = georgia[['PctFB', 'PctBlack', 'PctRural']].values
u = georgia['X']
v = georgia['Y']
g_coords = list(zip(u, v))
# inspect data contents
print('g_y:\n', g_y[:5])
print('\ng_X:\n', g_X[:5])
print('\nu:\n', list(u[:5]))
print('\nv:\n', list(v[:5]))
print('\ng_coords:\n', g_coords[:5], "\n")
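The array shapes produced by this preprocessing can be checked on a small stand-in table. The sketch below is a minimal illustration, assuming a hypothetical three-row DataFrame named `toy` with made-up values in the same columns as the Georgia data. It shows that `reshape((-1, 1))` turns the dependent variable into an n-by-1 column, that the explanatory variables form an n-by-k matrix, and that the coordinates become a list of n `(x, y)` tuples.

```python
import numpy as np
import pandas as pd

# hypothetical stand-in for the Georgia table; all values are made up
toy = pd.DataFrame({
    'PctBach': [8.2, 6.4, 6.6],
    'PctFB': [0.6, 1.6, 0.3],
    'PctBlack': [20.8, 26.9, 15.4],
    'PctRural': [75.6, 100.0, 61.7],
    'X': [941000.0, 895000.0, 930000.0],
    'Y': [3521000.0, 3471000.0, 3502000.0],
})

# dependent variable as a column vector of shape (n, 1)
y = toy['PctBach'].values.reshape((-1, 1))

# explanatory variables as a matrix of shape (n, k)
X = toy[['PctFB', 'PctBlack', 'PctRural']].values

# one (x, y) coordinate tuple per observation
coords = list(zip(toy['X'], toy['Y']))

print(y.shape, X.shape, len(coords))
```

Without the reshape, `toy['PctBach'].values` would be a one-dimensional array of shape `(n,)` rather than a column.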
Here the model bandwidth is determined computationally.
# select bandwidth computationally
# code in this cell is from Oshan et al. 2019
gwr_selector = Sel_BW(g_coords, g_y, g_X)
gwr_bw = gwr_selector.search()
print(gwr_bw)
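To give a sense of what a computational bandwidth search involves, the sketch below implements a plain golden-section search, which, as far as I know, is the default search strategy of `Sel_BW` (with AICc as the default criterion). This is not the mgwr implementation: `golden_section_min` and `toy_criterion` are hypothetical names, and the criterion is an invented curve whose minimum sits at an arbitrary bandwidth of 90, standing in for the model-fit score that would be computed by refitting GWR at each candidate bandwidth.

```python
import math

def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while abs(b - a) > tol:
        if fc < fd:
            # the minimum lies in [a, d]; reuse c as the new upper probe
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # the minimum lies in [c, b]; reuse d as the new lower probe
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2

def toy_criterion(bw):
    # invented stand-in for a fit criterion evaluated at bandwidth bw
    return (bw - 90.0) ** 2 + 300.0

best_bw = golden_section_min(toy_criterion, 20, 160)
print(round(best_bw))
```

Each iteration shrinks the search interval by a constant factor and needs only one new criterion evaluation, which matters when every evaluation means refitting a model.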
Here the model is constructed and fitted to the input.
# fit the model
# code in this cell is from Oshan et al. 2019
gwr_model = GWR(g_coords, g_y, g_X, gwr_bw)
gwr_results = gwr_model.fit()
print(gwr_results.resid_ss)
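Conceptually, GWR estimates a separate weighted least squares regression at each location, giving nearby observations more weight than distant ones. The sketch below illustrates that idea with NumPy on synthetic data. It assumes an adaptive bisquare kernel, which I believe matches the `GWR` defaults, but the helper names (`adaptive_bisquare_weights`, `local_fit`) are hypothetical and the exact neighbor-counting convention in mgwr may differ.

```python
import numpy as np

def adaptive_bisquare_weights(coords, i, k):
    """Bisquare weights around location i; the bandwidth is the
    distance from i to its k-th nearest point (counting i itself)."""
    d = np.sqrt(((coords - coords[i]) ** 2).sum(axis=1))
    h = np.sort(d)[k - 1]
    w = (1 - (d / h) ** 2) ** 2
    w[d >= h] = 0.0  # points at or beyond the bandwidth get no weight
    return w

def local_fit(coords, y, X, i, k):
    """Weighted least squares at location i: solve (X'WX) beta = X'Wy."""
    Xc = np.hstack([np.ones((X.shape[0], 1)), X])  # add an intercept column
    w = adaptive_bisquare_weights(coords, i, k)
    XtW = Xc.T * w
    beta = np.linalg.solve(XtW @ Xc, XtW @ y)
    return beta.ravel()

# synthetic data: the slope on X drifts upward from west to east
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
X = rng.normal(size=(n, 1))
slope = 1.0 + 0.2 * coords[:, 0]
y = (2.0 + slope * X[:, 0]).reshape((-1, 1))

west = int(np.argmin(coords[:, 0]))
east = int(np.argmax(coords[:, 0]))
b_west = local_fit(coords, y, X, west, 50)
b_east = local_fit(coords, y, X, east, 50)
print(b_west, b_east)
```

Because the weights depend on the regression location, the estimated slope at the easternmost point comes out larger than at the westernmost point, which is the kind of spatial variation in coefficients that the fitted `params` array captures.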
The model is fitted several times with different bandwidths, and differences are plotted.
# visualize effects of changing bandwidth
# code in this cell is from Oshan et al. 2019
fig, ax = plt.subplots(2, 3, figsize = (10, 6))
bws = (x for x in range(25, 175, 25))
vmins = []
vmaxs = []
for row in range(2):
    for col in range(3):
        bw = next(bws)
        gwr_model = GWR(g_coords, g_y, g_X, bw)
        gwr_results = gwr_model.fit()
        georgia['rural'] = gwr_results.params[:, -1]
        georgia.plot('rural', ax = ax[row, col])
        ax[row,col].set_title('Bandwidth: ' + str(bw))
        ax[row,col].get_xaxis().set_visible(False)
        ax[row,col].get_yaxis().set_visible(False)
        vmins.append(georgia['rural'].min())
        vmaxs.append(georgia['rural'].max())
sm = plt.cm.ScalarMappable(norm=plt.Normalize(vmin=min(vmins), vmax=max(vmaxs)))
fig.tight_layout()
fig.subplots_adjust(right=0.9)
cax = fig.add_axes([0.92, 0.14, 0.03, 0.75])
sm._A = []
cbar = fig.colorbar(sm, cax=cax)
cbar.ax.tick_params(labelsize=10)
plt.show()
Finally, measures of global and local fit are computed.
# assess global model fit
# code in this cell is from Oshan et al. 2019
gwr_selector = Sel_BW(g_coords, g_y, g_X)
gwr_bw = gwr_selector.search()
gwr_model = GWR(g_coords, g_y, g_X, gwr_bw)
gwr_results = gwr_model.fit()
print(gwr_results.aic)
print(gwr_results.aicc)
print(gwr_results.R2)
# assess local model fit
# code in this cell is from Oshan et al. 2019
georgia['R2'] = gwr_results.localR2
georgia.plot('R2', legend = True)
ax = plt.gca()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()